Author name recognition in degraded journal images
Internal identifier: 001181 ( Main/Exploration ); previous: 001180; next: 001182
Authors: Aliette De Bodard De La Jacopiere [France]; Laurence Likforman-Sulem [France]
Source:
- Proceedings of SPIE, the International Society for Optical Engineering [ 0277-786X ] ; 2006.
French descriptors
- Pascal (Inist)
- Wicri:
- topic: Base de données (database).
English descriptors
- KwdEn:
Abstract
A method for extracting names in degraded documents is presented in this article. The documents targeted are images of photocopied scientific journals from various scientific domains. Due to the degradation, there is poor OCR recognition, and pieces of other articles appear on the sides of the image. The proposed approach relies on the combination of a low-level textual analysis and an image-based analysis. The textual analysis extracts robust typographic features, while the image analysis selects image regions of interest through anchor components. We report results on the University of Washington benchmark database.
Affiliations:
Links toward previous steps (curation, corpus...)
- to stream PascalFrancis, to step Corpus: 000336
- to stream PascalFrancis, to step Curation: 000450
- to stream PascalFrancis, to step Checkpoint: 000351
- to stream Main, to step Merge: 001212
- to stream Main, to step Curation: 001181
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en" level="a">Author name recognition in degraded journal images</title>
<author><name sortKey="De Bodard De La Jacopiere, Aliette" sort="De Bodard De La Jacopiere, Aliette" uniqKey="De Bodard De La Jacopiere A" first="Aliette" last="De Bodard De La Jacopiere">Aliette De Bodard De La Jacopiere</name>
<affiliation wicri:level="3"><inist:fA14 i1="01"><s1>GET-Ecole Nationale Supérieure des Télécommunications Signal and Image Processing Department, 46 rue Barrault</s1>
<s2>75013 Paris</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName><region type="region" nuts="2">Île-de-France</region>
<settlement type="city">Paris</settlement>
</placeName>
</affiliation>
</author>
<author><name sortKey="Likforman Sulem, Laurence" sort="Likforman Sulem, Laurence" uniqKey="Likforman Sulem L" first="Laurence" last="Likforman-Sulem">Laurence Likforman-Sulem</name>
<affiliation wicri:level="3"><inist:fA14 i1="01"><s1>GET-Ecole Nationale Supérieure des Télécommunications Signal and Image Processing Department, 46 rue Barrault</s1>
<s2>75013 Paris</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName><region type="region" nuts="2">Île-de-France</region>
<settlement type="city">Paris</settlement>
</placeName>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">INIST</idno>
<idno type="inist">07-0376470</idno>
<date when="2006">2006</date>
<idno type="stanalyst">PASCAL 07-0376470 INIST</idno>
<idno type="RBID">Pascal:07-0376470</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000336</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000450</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000351</idno>
<idno type="wicri:doubleKey">0277-786X:2006:De Bodard De La Jacopiere A:author:name:recognition</idno>
<idno type="wicri:Area/Main/Merge">001212</idno>
<idno type="wicri:Area/Main/Curation">001181</idno>
<idno type="wicri:Area/Main/Exploration">001181</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a">Author name recognition in degraded journal images</title>
<author><name sortKey="De Bodard De La Jacopiere, Aliette" sort="De Bodard De La Jacopiere, Aliette" uniqKey="De Bodard De La Jacopiere A" first="Aliette" last="De Bodard De La Jacopiere">Aliette De Bodard De La Jacopiere</name>
<affiliation wicri:level="3"><inist:fA14 i1="01"><s1>GET-Ecole Nationale Supérieure des Télécommunications Signal and Image Processing Department, 46 rue Barrault</s1>
<s2>75013 Paris</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName><region type="region" nuts="2">Île-de-France</region>
<settlement type="city">Paris</settlement>
</placeName>
</affiliation>
</author>
<author><name sortKey="Likforman Sulem, Laurence" sort="Likforman Sulem, Laurence" uniqKey="Likforman Sulem L" first="Laurence" last="Likforman-Sulem">Laurence Likforman-Sulem</name>
<affiliation wicri:level="3"><inist:fA14 i1="01"><s1>GET-Ecole Nationale Supérieure des Télécommunications Signal and Image Processing Department, 46 rue Barrault</s1>
<s2>75013 Paris</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName><region type="region" nuts="2">Île-de-France</region>
<settlement type="city">Paris</settlement>
</placeName>
</affiliation>
</author>
</analytic>
<series><title level="j" type="main">Proceedings of SPIE, the International Society for Optical Engineering</title>
<idno type="ISSN">0277-786X</idno>
<imprint><date when="2006">2006</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt><title level="j" type="main">Proceedings of SPIE, the International Society for Optical Engineering</title>
<idno type="ISSN">0277-786X</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Database</term>
<term>Degradation</term>
<term>Image analysis</term>
<term>Interest region</term>
<term>Optical character recognition</term>
<term>Pattern recognition</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr"><term>Analyse image</term>
<term>Dégradation</term>
<term>Reconnaissance optique caractère</term>
<term>Région intérêt</term>
<term>Base donnée</term>
<term>Reconnaissance forme</term>
<term>4230</term>
</keywords>
<keywords scheme="Wicri" type="topic" xml:lang="fr"><term>Base de données</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">A method for extracting names in degraded documents is presented in this article. The documents targeted are images of photocopied scientific journals from various scientific domains. Due to the degradation, there is poor OCR recognition, and pieces of other articles appear on the sides of the image. The proposed approach relies on the combination of a low-level textual analysis and an image-based analysis. The textual analysis extracts robust typographic features, while the image analysis selects image regions of interest through anchor components. We report results on the University of Washington benchmark database.</div>
</front>
</TEI>
<affiliations><list><country><li>France</li>
</country>
<region><li>Île-de-France</li>
</region>
<settlement><li>Paris</li>
</settlement>
</list>
<tree><country name="France"><region name="Île-de-France"><name sortKey="De Bodard De La Jacopiere, Aliette" sort="De Bodard De La Jacopiere, Aliette" uniqKey="De Bodard De La Jacopiere A" first="Aliette" last="De Bodard De La Jacopiere">Aliette De Bodard De La Jacopiere</name>
</region>
<name sortKey="Likforman Sulem, Laurence" sort="Likforman Sulem, Laurence" uniqKey="Likforman Sulem L" first="Laurence" last="Likforman-Sulem">Laurence Likforman-Sulem</name>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001181 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 001181 | SxmlIndent | more
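Outside Dilib, the XML record shown above can also be read with a general-purpose XML parser. The following is a minimal sketch in Python, not part of the Dilib toolchain. It works on an abridged copy of the record and assumes a placeholder namespace URI for the `wicri:` prefix, because the record uses that prefix without declaring it and a strict parser would otherwise reject it. The names `FRAGMENT` and `list_authors` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the record above. The xmlns:wicri declaration is an
# assumption added for this sketch: the original record uses the wicri:
# prefix without binding it to any namespace URI.
FRAGMENT = """<record xmlns:wicri="urn:example:wicri">
<titleStmt>
<title xml:lang="en" level="a">Author name recognition in degraded journal images</title>
<author><name first="Aliette" last="De Bodard De La Jacopiere">Aliette De Bodard De La Jacopiere</name>
<affiliation wicri:level="3"><country>France</country></affiliation>
</author>
<author><name first="Laurence" last="Likforman-Sulem">Laurence Likforman-Sulem</name>
<affiliation wicri:level="3"><country>France</country></affiliation>
</author>
</titleStmt>
</record>"""


def list_authors(xml_text):
    """Return (last, first) pairs for every <name> element, in document order."""
    root = ET.fromstring(xml_text)
    return [(name.get("last"), name.get("first")) for name in root.iter("name")]


# Read the article title and the author list from the fragment.
title = ET.fromstring(FRAGMENT).find(".//title").text
authors = list_authors(FRAGMENT)

print(title)
for last, first in authors:
    print(f"{last}, {first}")
```

To apply the same function to the full record, the undeclared `wicri:` and `inist:` prefixes would first have to be bound to namespace URIs in the same way.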
To link to this page within the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= Main |étape= Exploration |type= RBID |clé= Pascal:07-0376470 |texte= Author name recognition in degraded journal images }}
This area was generated with Dilib version V0.6.32.